An address generator provides for generation of addresses 
for a plurality of different tests by allowing for primitive 
polynomial-based pseudo-random bit-streams to be shifted 
into the address generator. Embodiments of the present 
invention utilize the address values to generate data values 
to be stored in a memory under test. Likewise, an expected 
data value is generated and compared to the stored value. A 
data comparator verifies the stored data against the expected 
value. A single latch stores compare results for a plurality of 
memory locations. 
SYSTEM AND METHOD FOR TESTING INTEGRATED MEMORIES

FIELD OF THE INVENTION

The present invention generally relates to the testing of integrated memories, and more specifically to the testing of memories having test collars.

BACKGROUND OF THE INVENTION

Testing of complex integrated devices, such as microprocessors, is necessary in order to assure proper operation of these devices. For integrated devices that have associated memory blocks, it is necessary to assure functionality of the memory. One prior art method of testing integrated memory blocks is to pin-out, either directly or through multiplexors, the address, control and data pins of the memory. In this manner, it is possible to exercise all storage locations of the memory to ensure proper operation. However, the pin count needed to test a device in this manner can be larger than the number of pins available. In addition, for each memory multiplexed in this manner, additional logic is introduced into the delay paths of the device, resulting in slower input/output (IO) propagation times that can affect performance.

Prior art FIG. 1 illustrates another prior art solution for testing integrated memories. Specifically, a Built-in Self Test (BIST) controller is used to automatically verify functionality of individual blocks of memory. For example, on start up, the BIST controller will perform a test routine to verify the integrity of the memory. If errors are found, they are reported.

FIG. 1 illustrates a device 100 having two BIST controllers for testing three integrated memories. The portion 100 of FIG. 1 represents a single block of memory being tested by a BIST controller. Generally, the BIST controller includes a test collar which, in conjunction with the BIST controller, generates addresses, data values, and compares received results to expected results. The portion 120 of FIG. 1 represents a double block of memories being tested by a single BIST controller. The use of BIST technology dedicates BIST hardware to specific memory locations. As a result, when a device has many different sizes or kinds of memory devices it is necessary for separate BIST controllers and associated collars to be provided. This increases the size of a memory block in the range of 2% to 8%.

Testability techniques can be implemented using modern simulation and layout tools. However, these tools only perform fixed test algorithms. For example, these tools are only capable of implementing March algorithms to detect errors. March algorithms read and write data in an up direction (incrementing address values) or in a down direction. While such tests are good enough for establishing production testing of traditional sized memories, sufficient test coverage is not available for development and debugging purposes.

Conventionally available tools and techniques for testing integrated memories do not have the flexibility to test wide varieties of memories. For example, modern graphics devices utilize small word sizes (8-bits or less) and very large word sizes (128-bits or more). Within these ranges, the address space can be a few bytes or Kilobytes. However, a problem with commercially available tools is that they are optimized to efficiently support only the most common memory types and sizes.

Therefore, a system and method for testing integrated memories that overcomes these problems would be desirable.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a prior art device having integrated memories;

FIG. 2 illustrates, in block form, an integrated device in accordance with the present invention;

FIG. 3 illustrates, in block form, a custom memory having an associated test collar;

FIG. 4 illustrates, in logic schematic form, an address generator in accordance with the present invention;

FIG. 5 illustrates, in logic schematic form, a data generator in accordance with the present invention;

FIG. 6 illustrates, in logic schematic form, a data comparator in accordance with the present invention;

FIG. 7a illustrates a 4-bit MISR;

FIG. 7b illustrates a middle bit-slice using a SEDFFTR; and

FIG. 8 illustrates a flow diagram in accordance with the present invention.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 2 illustrates a specific embodiment of a system 200 having integrated memory in accordance with the present invention. The system 200 includes memories 221-225. The memories 221-225 are illustrated as having various heights and widths in order to represent varying address space sizes and word widths. For example, memories 221 and 222 can each represent memories having 8-bit words stored within a 256-word memory space, while memory 223 can represent a memory having 128-bit words stored within a 1024 (1K) word memory space.

Each memory 221-225 has an associated test collar. The test collars can be enabled individually based upon a test mode selection indicator provided by the Test Mode Controller 250. Each test collar is further connected to a common Test Control bus 211. The Test Control bus 211 provides the control signals necessary to execute specific types of memory testing. Generally, the test to be performed will be a function of both the signal from the Test Mode Controller 250, and the actual control signal provided via Test Control bus 211. Results from the various memories are provided through the multiplexor 260. Specifically, each Collar 231-235, or a portion of the Collars 231-235, shares output pads through a multiplexor. As illustrated in FIG. 2, the multiplexor 260 includes individual multiplexors 261 and 262. Individual multiplexor 262 selects a signal from one of Collar 231-235 to provide to the output of device 200. Individual multiplexors are connected to those collars having a second output.

In operation, a user provides a TEST MODE signal to select one of the memories of device 200 for testing. The TEST MODE signal can be responsible for merely selecting a memory, as well as be responsible for specifying a specific test. Once a memory is selected using the TEST MODE signal, the Test Control bus 211 is used to control the implementation for the test. The Test Control bus 211 signal can also define, or further define the test to be performed. For example, tests capable of being performed in accordance with the present invention include March-2 test, March-3 test, March-4 test, March-5 test, Butterfly test, and the Galpat test.

The Test Control bus 211 can also be used to freeze a memory by writing a value into a latch associated with the device that disables the memory's write enable. For example, a latch in each memory's test collar can be set by the test control signals.

FIG. 3 illustrates in greater detail, a memory array, or macro, and its associated collar. In a general sense, the pins
of any kind of memory macro can be viewed as belonging to one of three groups: the data pins, the address pins, and the control pins. For the custom memories being designed for specific products, the associated pins can be broken down more specifically. For example, it is possible to implement memories having only unidirectional data pins, and separate sets of address pins for read and write operations, with the control pins allowing for concurrent read and write operations. Therefore, for a specific memory implementation, there are 7 categories of pins:

(1) the data input pins, also called "write data" pins,
(2) the data output pins, also called "read data" pins,
(3) the write address pins,
(4) the read address pins,
(5) the write control pins, usually just a "write clock" and a "write enable",
(6) the read control pins, usually just a "read clock", a "read enable" and an "extended sense",
(7) the test control pins, usually just TMEM and TSCAN, which represent "memory test mode" and "scan test mode", respectively.

Some of these pin categories (specifically #1, #3, #4, #5 and partly #6) can have "twinned pins", where each pin has a "proper pin", which is used during normal operation, and a "test pin", which is used during a special memory test mode. Inside the custom memory macro, each pair of "twinned" pins is driving a 2-to-1 multiplexor which chooses whether to use the "proper pin" or the "test pin". All of these 2-to-1 multiplexors are controlled by a single select signal, namely TMEM.

In order to support the special memory test mode mentioned above, a test collar is used. The test collar comprises extra circuitry that surrounds each custom memory, and is used for test purposes only. The components of the test collar have a one-to-one correspondence with the 7 categories of pins described above. Each collar must contain circuitry to perform the following functions:

(1) data input generation,
(2) data output comparison with expected responses,
(3) write address generation,
(4) read address generation,
(5) write control generation,
(6) read control generation,
(7) test mode control.

Some of these functions are implemented with global signals that are routed chip-wide and illustrated in FIG. 3. Function #7 (test mode control) is implemented by the 2 global signals TMEM and TSCAN. Function #6 (read control generation) is implemented by the global signal TREN, by effective control of the clock tree connected to RCLK, and by appropriate control over EXTSENSE. Function #5 (write control generation) is implemented by the global signal TWEN, and by connecting TWCLK to the same clock tree as RCLK, which is already effectively under external control. Note that the same clock tree, generally the Read clock (RCLK), is used for both read and write operations during test mode. In summary, for the custom memory macro of FIG. 3, the functions #7, #6, and #5 are implemented by 4 global signals, 1 controllable clock tree, and possibly 1 additional global signal for EXTSENSE. A few custom memories have more than one write port or more than one read port, which may increase the number of global signals needed to implement functions #5 and/or #6.

Theoretically, the remaining 4 functions (#1, #2, #3, #4) can also be implemented by connecting the first 4 categories of test pins to 4 chip-wide test-busses. However, by serially-transmitting and/or locally-generating the bit patterns required for the remaining 4 functions, the cost of globally routing the additional lines is avoided.

There are many test algorithms used to verify memory functionality. The ability to control and observe integrated memory devices increases the complexity of testing integrated memories. The most efficient memory test algorithms, in terms of fault coverage, are in the family of so-called "march tests". A march test is composed of a sequence of "march elements", where each march element is a "For-Loop" that spans the complete address-space in either "up" or "down" direction, and performs the same sequence of read and/or write operations at each address location. Here is an example of a very common "march test", written in a very compact, customary notation:

March-1
⇑(w0)
⇑(r0, w1)
⇑(r1, w0)
⇓(r0, w1)
⇓(r1, w0)

The algorithm above assumes a "single-bit" type of memory, where each bit has a distinct, different address. There are 5 "march elements" shown above, of which the first 3 are in the "up" direction, and the last 2 are in the "down" direction. The "up" direction can be the obvious "increasing" order, from 0 to N-1, where N is the total number of addresses, or it can be any arbitrary permutation of all addresses. The only restriction is that the "down" direction must be the exact reverse of the "up" addressing sequence. The write operations are represented by w0 and w1, where w0 writes a 0 and w1 writes a 1. The read operations with expected data specified are represented by r0 and r1, where r0 indicates that 0 is the expected read value, and r1 indicates that the expected read value is a 1.

Most integrated memories have multi-bit words of data that are written to or read from each address location. If the above test is extended by simply replacing the single bit 0 and single bit 1 with whole words of all 0's or all 1's, then certain coupling faults between storage cells in the same word would fail to be detected. There are several specific embodiments, or schemes, for extending a single-bit march test to a multi-bit march test. The goal of the various embodiments is to obtain high fault coverage. Each scheme creates different data patterns to replace the single bit 0 and single bit 1. The schemes are named according to the type of replacement data patterns, namely: (1) primary, (2) serialized, and (3) address-based.

For a word with b bits, the number of primary data patterns equals 2(⌈log2 b⌉+1). The primary data patterns are all the periodic power-of-2 patterns and the all 0's and all 1's patterns. For b=4, the "primary" data patterns are shown below:

"0" replaced by    "1" replaced by
0101               1010
0011               1100
0000               1111
For b=16, the "primary" data patterns are shown below:

"0" replaced by      "1" replaced by
0101010101010101     1010101010101010
0011001100110011     1100110011001100
0000111100001111     1111000011110000
0000000011111111     1111111100000000
0000000000000000     1111111111111111
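
The primary patterns tabulated above follow a simple rule, which the following minimal sketch tries to capture: runs of identical bits of length 1, 2, 4, and so on, followed by the constant pair. The function name `primary_patterns` and the string representation of words are illustrative assumptions, not part of the described hardware.

```python
from math import ceil, log2


def primary_patterns(b):
    """Return (zero_pattern, one_pattern) string pairs for a b-bit word.

    The periodic pairs come first (run lengths 1, 2, 4, ...), and the
    constant all-0's / all-1's pair comes last.
    """
    pairs = []
    run = 1  # length of each run of identical bits; the period is 2 * run
    while run < b:
        zero = "".join("0" if (i // run) % 2 == 0 else "1" for i in range(b))
        one = "".join("1" if c == "0" else "0" for c in zero)
        pairs.append((zero, one))
        run *= 2
    pairs.append(("0" * b, "1" * b))
    return pairs


def primary_pattern_count(b):
    """Closed-form count of primary patterns: 2 * (ceil(log2 b) + 1)."""
    return 2 * (ceil(log2(b)) + 1)
```

Under these assumptions, `primary_patterns(16)` returns five pairs, one per row of the b=16 table, and twice the number of pairs agrees with the closed-form count.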



The "primary" data pattern equivalent of the March-1 test is obtained by repeating March-1 for each complementary pair of patterns that replace the single bit 0 and 1. For b=4, March-1 is repeated 3 times:

March-2 (4-bit version)
⇑(w0101)
⇑(r0101, w1010)
⇑(r1010, w0101)
⇓(r0101, w1010)
⇓(r1010, w0101)
⇑(w0011)
⇑(r0011, w1100)
⇑(r1100, w0011)
⇓(r0011, w1100)
⇓(r1100, w0011)
⇑(w0000)
⇑(r0000, w1111)
⇑(r1111, w0000)
⇓(r0000, w1111)
⇓(r1111, w0000)

For b=16, March-1 is repeated 5 times:

March-2 (16-bit version)
⇑(w0101010101010101)
⇑(r0101010101010101, w1010101010101010)
⇑(r1010101010101010, w0101010101010101)
⇓(r0101010101010101, w1010101010101010)
⇓(r1010101010101010, w0101010101010101)
⇑(w0011001100110011)
⇑(r0011001100110011, w1100110011001100)
⇑(r1100110011001100, w0011001100110011)
⇓(r0011001100110011, w1100110011001100)
⇓(r1100110011001100, w0011001100110011)
⇑(w0000111100001111)
⇑(r0000111100001111, w1111000011110000)
⇑(r1111000011110000, w0000111100001111)
⇓(r0000111100001111, w1111000011110000)
⇓(r1111000011110000, w0000111100001111)
⇑(w0000000011111111)
⇑(r0000000011111111, w1111111100000000)
⇑(r1111111100000000, w0000000011111111)
⇓(r0000000011111111, w1111111100000000)
⇓(r1111111100000000, w0000000011111111)
⇑(w0000000000000000)
⇑(r0000000000000000, w1111111111111111)
⇑(r1111111111111111, w0000000000000000)
⇓(r0000000000000000, w1111111111111111)
⇓(r1111111111111111, w0000000000000000)

There are a few mathematically-definable coupling faults, within the same multi-bit word, that the above March-2 tests would not be able to detect. All physically-realistic coupling faults would be caught, however. In the interest of detecting physically-realistic faults, it is common to use only the unshaded sections in March-2, which employ only the highest frequency pattern pairs, and the completely constant pattern pairs. The shaded middle pattern pairs are much less likely to detect any additional faults in the cell array. However, if the memory macro uses multiplexing between the storage cell array and the read/write data ports, then these shaded patterns provide extensive additional fault coverage for the multiplexors themselves.

In understanding serialized data patterns, consider a 16-bit cell array arranged in a 4 by 4 square. If we apply March-1 to this cell array, we get the following behavior at the start of the second march element:

0000 1000 1100 1110 1111 1111 1111 1111 1111 1111 1111
0000 0000 0000 0000 0000 1000 1100 1110 1111 1111 1111
0000 0000 0000 0000 0000 0000 0000 0000 0000 1000 1100
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000



This cell array could also be viewed as a 4-word array, with 4 bits per word. In that case, the 1-bit march element ⇑(r0, w1) becomes a much longer 4-bit march element ⇑(r0000, w1000, r1000, w1100, r1100, w1110, r1110, w1111). The serialized version of March-1, for a 4-bit word-size, is shown below:

March-3 (4-bit version)
⇑(w0000)
⇑(r0000, w1000, r1000, w1100, r1100, w1110, r1110, w1111)
⇑(r1111, w0111, r0111, w0011, r0011, w0001, r0001, w0000)
⇓(r0000, w1000, r1000, w1100, r1100, w1110, r1110, w1111)
⇓(r1111, w0111, r0111, w0011, r0011, w0001, r0001, w0000)
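
The march notation used in the listings above can be read as ordinary nested loops: one loop over the march elements, one over the addresses in the stated direction, and one over the operations of the element. The following is a minimal software sketch of that reading, not a model of the test collar hardware. The names `run_march`, `MARCH_1`, and the `StuckAtZero` fault model are hypothetical and exist only for illustration.

```python
def run_march(memory, elements):
    """Apply march elements to `memory`, a list-like object of stored values.

    Each element is (direction, ops), where direction is "up" or "down" and
    ops is a list of ("w", value) or ("r", expected_value) operations.
    Returns a list of (address, expected, actual) mismatches.
    """
    errors = []
    n = len(memory)
    for direction, ops in elements:
        addresses = range(n) if direction == "up" else range(n - 1, -1, -1)
        for addr in addresses:
            for kind, value in ops:
                if kind == "w":
                    memory[addr] = value
                elif memory[addr] != value:
                    errors.append((addr, value, memory[addr]))
    return errors


# March-1 in the (direction, ops) form used by run_march.
MARCH_1 = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
]


class StuckAtZero(list):
    """Hypothetical fault model: one cell ignores writes and always holds 0."""

    def __init__(self, size, bad_addr):
        super().__init__([0] * size)
        self.bad_addr = bad_addr

    def __setitem__(self, addr, value):
        super().__setitem__(addr, 0 if addr == self.bad_addr else value)
```

Because the stored values are compared with `!=`, the same runner accepts multi-bit words written as strings such as "0101", so the March-2 and March-3 listings can be expressed in the same form.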



March tests have a run-time that is linear with respect to the number of addresses, but which is multiplied by a factor representing the number of bits per word. That word-size factor is O(b) for serialized patterns. For primary patterns, the word-size factor is O(log2 b), if all primary patterns are used, or O(1), if only the unshaded patterns are used. For very wide data-word memories, the run-time of march tests using serialized patterns is much greater than march tests using primary patterns.
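
To make the word-size factors concrete, the sketch below builds the serialized test for an arbitrary word width and counts operations per address, then compares that with the count for primary patterns. It assumes the serialized elements always walk the word from the left, as in the 4-bit listing; the function names are illustrative only.

```python
from math import ceil, log2


def serialized_element(b, rising=True):
    """Read/write operations of one serialized march element on a b-bit word.

    rising=True walks 00..0 to 11..1 one bit at a time from the left;
    rising=False walks 11..1 back to 00..0 the same way.
    """
    fill, clear = ("1", "0") if rising else ("0", "1")
    ops = []
    for k in range(b):
        before = fill * k + clear * (b - k)
        after = fill * (k + 1) + clear * (b - k - 1)
        ops += [("r", before), ("w", after)]
    return ops


def march_3(b):
    """Serialized version of March-1 for a b-bit word."""
    up, down = serialized_element(b, True), serialized_element(b, False)
    return [("up", [("w", "0" * b)]), ("up", up), ("up", down),
            ("down", up), ("down", down)]


def ops_per_address(elements):
    """Number of read/write operations applied to each address."""
    return sum(len(ops) for _, ops in elements)


def primary_ops_per_address(b, unshaded_only=False):
    """March-1 has 1 + 2 + 2 + 2 + 2 = 9 operations, repeated per pattern pair."""
    pairs = 2 if unshaded_only else ceil(log2(b)) + 1
    return 9 * pairs
```

For a 128-bit word this gives 1025 operations per address for the serialized test, against 72 for all primary patterns and 18 for the unshaded pairs only.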

For the previous 2 types of patterns, there is no restriction relating the number of bits per word, b, and the number of bits used for addressing, n. (If the total number of addresses is N, then n=⌈log2 N⌉.) If our custom macros satisfy the condition that b>=n, then there is a third class of patterns, address-based data patterns, which may be employed, which are based on using the n-bit address pattern within the b-bit data word. For simplicity, and with no loss of generality, in the following discussion, assume that b=n. Let A represent the binary address pattern, and let Ā represent the one's-complement of the binary address pattern. Then March-1 can be extended to multi-bit words, as shown below:

March-4
⇑(wA)
⇑(rA, wĀ)
⇑(rĀ, wA)

You will notice that the 2 "down" direction march elements have been deleted. Usually, march tests are written in a symmetrical fashion, where all the "up" direction march elements (except for the obvious initialize-to-all-0's first march element) are repeated as "down" direction march elements. The purpose of this repetition, using both addressing orders, is to detect address decoder faults. But if the binary contents of each address location are now unique, then we can detect those same address decoder faults by traversing the addresses in one order only. This scheme actually reduces the run-time of the 1-bit version, but only when b>=n.

To detect coupling faults within each data-word, we can perform a bit-wise exclusive-OR operation on each address pattern with the primary data patterns. To detect only the physically-realistic coupling faults, we only need the very first and very last of the primary patterns. The very first primary pattern is also known as the "even-odd" pattern, and the very last primary pattern (all 0's and all 1's) can be interpreted as a mathematical identity operation. Let ⊕ represent the bit-wise exclusive-OR operation, then let

A  = A ⊕ 00000000 . . .
Ā  = A ⊕ 11111111 . . .
A' = A ⊕ 01010101 . . .
Ā' = A ⊕ 10101010 . . .

Now we can expand March-4, in the style of March-2 (unshaded), as follows:

March-5
⇑(wA)
⇑(rA, wĀ)
⇑(rĀ, wA)
⇑(wA')
⇑(rA', wĀ')
⇑(rĀ', wA')

Obviously, the remaining (shaded) primary patterns can also be bit-wise exclusive-ORed with the address pattern A, to provide multiplexor fault coverage, where applicable.

All of the re-formulations of March-1 shown above deal with the replacement of the single data bit with multi-bit data patterns. A frequently useful variation, inspired by EXORing data with the even-odd data patterns, is to perform some data transformation conditional upon the least-significant address bit(s). When the address LSB is 0 (an even address), we may choose one operation, and when the address LSB is 1 (an odd address), we choose the other. This may create a checkerboard effect in some multi-bit-per-word storage cell arrays. We add even/odd subscripts to the read/write operations to specify this conditional operation. In the case of a single-bit-per-word storage cell array, we calculate the parity of 2 LSBs (the physical row address LSB and the physical column address LSB) to determine the "evenness or oddness" of an address. Two examples are given below: first, the checkerboard version of March-1, and second, the checkerboard version of March-5.

March-6
⇑(w_e0|w_o1)
⇑(r_e0|r_o1, w_e1|w_o0)
⇑(r_e1|r_o0, w_e0|w_o1)
⇓(r_e0|r_o1, w_e1|w_o0)
⇓(r_e1|r_o0, w_e0|w_o1)

March-7
⇑(w_eA|w_oĀ)
⇑(r_eA|r_oĀ, w_eĀ|w_oA)
⇑(r_eĀ|r_oA, w_eA|w_oĀ)
⇑(w_eA'|w_oĀ')
⇑(r_eA'|r_oĀ', w_eĀ'|w_oA')
⇑(r_eĀ'|r_oA', w_eA'|w_oĀ')
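
The address-based data words described above, and the even/odd selection used by the checkerboard variants, can be computed directly from the address. The sketch below is one possible software rendering under stated assumptions: the even-odd pattern is taken to start with 0 at the most significant bit, and the dictionary keys and function names are illustrative rather than taken from the source.

```python
def address_patterns(addr, b):
    """Return the four b-bit data words derived from address `addr`.

    "a"        : the address itself (XOR with all 0's)
    "a_bar"    : its one's complement (XOR with all 1's)
    "a_eo"     : XOR with the even-odd pattern 0101...
    "a_eo_bar" : XOR with the complementary pattern 1010...
    """
    mask = (1 << b) - 1
    even_odd = int(("01" * b)[:b], 2)  # 0101..., truncated to b bits
    a = addr & mask
    return {
        "a": a,
        "a_bar": a ^ mask,
        "a_eo": a ^ even_odd,
        "a_eo_bar": a ^ even_odd ^ mask,
    }


def checkerboard_word(addr, even_word, odd_word):
    """Select the data word by address parity (LSB 0 is even, LSB 1 is odd)."""
    return odd_word if addr & 1 else even_word
```

Since XOR with a fixed pattern is a bijection, every address still stores a unique word under each of the four variants, which is the property the one-direction address-based tests rely on.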



Fancier checkerboarding is possible by conditioning operations upon more than just one or two "least-significant" address bit(s).

The run-time of march tests is linear in the number of addresses, O(N). There are further types of tests, such as "butterfly tests" whose run-time complexity is O(N log2 N), "galloping row/column tests" whose run-time complexity is O(N√N), and "galloping pattern tests" whose run-time complexity is O(N²). The names galrow, galcol, and galpat are commonly used to refer to "galloping row", "galloping column" and "galloping pattern", respectively. All the galloping and butterfly tests introduce a second level of nested "For-Loops" into the march elements' first level "For-Loop". The difference among the galloping and butterfly tests is in how much of the address-space is spanned by the inner "For-Loops". The galpat tests' inner "For-Loop" spans the entire address-space, hence the O(N²) complexity. The galrow and galcol tests' inner "For-Loop" spans only addresses located within the same physical row or column of the storage cell array, hence the square-root within the O(N√N) complexity. The butterfly tests' inner "For-Loop" spans only addresses that are a Hamming-distance of 1 away from the current address in the outer (march element's) "For-Loop", hence the base-2 logarithm in the O(N log2 N) complexity.
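
The three inner-loop spans described above can be compared by counting how many moving-cell addresses are visited in one pass of the outer loop. The sketch below assumes a power-of-2 address space of `n_bits` address bits and, for galrow, a hypothetical split in which `row_bits` of those bits select the physical row; both parameter names are illustrative.

```python
def butterfly_neighbors(base, n_bits):
    """Addresses at a Hamming distance of 1 from `base`, MSB flipped first."""
    return [base ^ (1 << bit) for bit in reversed(range(n_bits))]


def inner_loop_visits(test, n_bits, row_bits=None):
    """Total moving-cell visits over one outer loop of N = 2**n_bits cells."""
    n = 1 << n_bits
    if test == "galpat":     # every address except the base cell
        return n * (n - 1)
    if test == "galrow":     # other cells in the same physical row
        cells_per_row = 1 << (n_bits - row_bits)
        return n * (cells_per_row - 1)
    if test == "butterfly":  # one neighbor per address bit
        return n * n_bits
    raise ValueError(f"unknown test: {test}")
```

For a square array with 8 address bits and 4 row bits, the counts are 65280 for galpat, 3840 for galrow, and 2048 for butterfly, in line with the N², N√N, and N log2 N growth rates.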

Since the march test notation is insufficient to represent the actions of the galloping and butterfly tests, we introduce some commonly used notation below. Since there are now 2 levels of "For-Loops", we must name the "Loop-Index" of each one, in order to simplify our discussion. The outer Loop-Index is usually called the "base-address" or "base-cell" (also known as the "home address"). The inner Loop-Index is usually called the "moving-address" or "moving-cell" (also known as the "away address"). March elements are, by definition, looping with the "base-cell" exclusively. Here is the compact notation for an arbitrary inner loop, with its pseudo-code equivalent:

( ra  •   wb  rb  •  )
( •   rc  •   •   rc )

for moving-cell=0 to N-1, excluding the address of base-cell
    if stored-value[moving-cell]≠a, then error(moving-cell)
    if stored-value[base-cell]≠c, then error(base-cell)
    write-to[moving-cell]←b
    if stored-value[moving-cell]≠b, then error(moving-cell)
    if stored-value[base-cell]≠c, then error(base-cell)
endfor

The upper row specifies "moving-cell" operations, and the lower row specifies "base-cell" operations. Almost always, the "base-cell" operations within inner loops are restricted to read operations only. The "moving-cell" may be written or read. One of the simplest and most widely known versions of (single-bit) galpat is shown below, first in compact notation, then in pseudo-code:

Galpat-1
⇑(w0)
⇑(w1, r1, ( r0  •  ), w0)
          ( •   r1 )
⇑(w1)
⇑(w0, r0, ( r1  •  ), w1)
          ( •   r0 )

for base-cell=0 to N-1
    write-to[base-cell]←0
endfor

for base-cell=0 to N-1
    write-to[base-cell]←1
    if stored-value[base-cell]≠1, then error(base-cell)
    for moving-cell=0 to N-1, excluding the address of base-cell
        if stored-value[moving-cell]≠0, then error(moving-cell)
        if stored-value[base-cell]≠1, then error(base-cell)
    endfor
    write-to[base-cell]←0
endfor

for base-cell=0 to N-1
    write-to[base-cell]←1
endfor

for base-cell=0 to N-1
    write-to[base-cell]←0
    if stored-value[base-cell]≠0, then error(base-cell)
    for moving-cell=0 to N-1, excluding the address of base-cell
        if stored-value[moving-cell]≠1, then error(moving-cell)
        if stored-value[base-cell]≠0, then error(base-cell)
    endfor
    write-to[base-cell]←1
endfor

Obviously, this version of galpat would be converted to a multi-bit version, for use with our custom memories.

A butterfly test uses a similar notation, except that the delimiters are not parentheses but angle-brackets instead. The moving-cell only takes on values that are a Hamming-distance of 1 away from the base-cell.

⟨ •   rb ⟩
⟨ ra  •  ⟩

for walking-1-bit ∈ {10000000, 01000000, 00100000, . . . , 00000010, 00000001} do:
    moving-cell=base-cell⊕walking-1-bit
    if stored-value[base-cell]≠a, then error(base-cell)
    if stored-value[moving-cell]≠b, then error(moving-cell)
endfor

Here is the most common version of the (single-bit) butterfly test:

Butterfly-1
⇑(w0)

This test is designed to detect stuck-open faults that cause sequential behavior in the address decoders, which march tests are unable to detect.
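
For readers who want to experiment with the galpat pseudo-code given earlier, the following is a minimal runnable transcription in Python. It models the memory as a list of single-bit cells. The `CoupledMemory` class is a hypothetical fault model introduced only to show a fault being caught; it is not described in the source.

```python
def galpat_1(memory):
    """Single-bit Galpat-1 over `memory`, a list-like object of 0/1 cells.

    Returns a list of (address, expected, actual) mismatches.
    """
    n = len(memory)
    errors = []

    def check(addr, expected):
        if memory[addr] != expected:
            errors.append((addr, expected, memory[addr]))

    for background in (0, 1):
        foreground = 1 - background
        for base in range(n):          # initialize every cell
            memory[base] = background
        for base in range(n):
            memory[base] = foreground
            check(base, foreground)
            for moving in range(n):    # inner loop over all other cells
                if moving == base:
                    continue
                check(moving, background)
                check(base, foreground)
            memory[base] = background  # restore before moving on
    return errors


class CoupledMemory(list):
    """Hypothetical fault: a 0-to-1 write to `aggressor` forces `victim` to 1."""

    def __init__(self, size, aggressor, victim):
        super().__init__([0] * size)
        self.aggressor, self.victim = aggressor, victim

    def __setitem__(self, addr, value):
        rising = addr == self.aggressor and self[addr] == 0 and value == 1
        super().__setitem__(addr, value)
        if rising:
            super().__setitem__(self.victim, 1)
```

With a fault-free list the function returns no errors, while the coupled memory reports the victim cell as soon as the inner loop reads it after the aggressor has been written.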

Now that we have sufficient background material about 
the different types of memory tests that we may use, we may 
proceed with descriptions of circuits that support the opera- 
tions needed by these tests. 

If March-2 and March-3 tests must be supported, because b<n, then both an "up" and a "down" address generation capability are needed. Specific embodiments capable of supporting this include a binary up-down counter, a bi-directional LFSR (linear feedback shift register), or two shift-registers (shifting in opposite directions) with a 2-to-1 multiplexor merging their outputs. For custom memories that have non-power-of-2 addressing, the binary up-down counter is the clear favorite. The two shift-registers with mux is an attractive circuit because of its simplicity and high-speed. Each shift-register circuit works by imitating an "LFSR with one external (wide-fanin) EXOR-gate feeding the serial-input". The shift-register is more flexible than an actual LFSR because we are not committed to using just one primitive polynomial to generate the addressing sequence.

If we can use tests like March-4 and March-5, because b>=n, then we require smaller circuits. For example, either (1) a binary up-only counter, or (2) a unidirectional LFSR, or (3) one shift-register can be used. Again, where non-power-of-2 addressing is used, the binary up-only counter is the clear favorite, although the single shift-register is attractively simple.

In order to support more than just march tests, such as multi-bit versions of Butterfly-1 and Galpat-1, the address generator will contain, at least: (a) a multiplexor for each address bit, (b) an EXOR-gate per bit, and (c) two sources of addresses, of which at least one must be a shift-register, and the other may be either a binary counter, or an opposite-direction shift-register. If b<n, then the binary counter will have up/down counting ability, else if b>=n, then it can be an up-only counter. The relation of b to n is irrelevant if the second address source is the opposite-direction shift-register.

FIG. 4 illustrates a specific embodiment of a 3-bit version of an address generator that supports all possible March, Butterfly and Galpat tests, using only shift-registers as address sources.
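
The idea of a plain shift-register imitating an LFSR, with the feedback bit computed externally and fed to the serial input, can be sketched in software as follows. The example assumes a 3-bit register and the recurrence a[n] = a[n-2] XOR a[n-3] (polynomial x^3 + x + 1); the class and function names, the tap indexing, and the choice of seed are illustrative assumptions rather than details of FIG. 4.

```python
class ShiftRegister:
    """Plain shift register; index 0 is the stage nearest the serial input."""

    def __init__(self, width):
        self.bits = [0] * width

    def shift(self, serial_in):
        self.bits = [serial_in] + self.bits[:-1]

    def value(self):
        return int("".join(str(bit) for bit in self.bits), 2)


def serial_stream(taps, seed, count):
    """Generate `count` serial-input bits for an LFSR with feedback `taps`.

    `seed` is the register state the stream assumes; `taps` lists the stage
    indices that are XORed together to form each new serial-input bit.
    """
    state = list(seed)
    stream = []
    for _ in range(count):
        bit = 0
        for tap in taps:
            bit ^= state[tap]
        stream.append(bit)
        state = [bit] + state[:-1]
    return stream


def address_sequence(width, stream):
    """Shift `stream` into a cleared register and record each address."""
    reg = ShiftRegister(width)
    addresses = []
    for bit in stream:
        reg.shift(bit)
        addresses.append(reg.value())
    return addresses


# Load a single 1, follow the LFSR stream for 6 shifts, then shift in a final 0.
STREAM_3BIT = [1] + serial_stream(taps=(1, 2), seed=[1, 0, 0], count=6) + [0]
```

Shifting `STREAM_3BIT` into a cleared 3-bit register visits all eight addresses once and ends on all 0's. Because the stream is supplied from outside the register, a different polynomial only requires a different stream, not different hardware.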

In implementing FIG. 4 at the chip-level, RCLK is already accounted for as illustrated in FIG. 3. The TADR0-TADR2 pins coupled to multiplexers 440, 441, 442 respectively, belong to the custom memory itself. Therefore, an extra 5 global signals are needed for this particular implementation of an address generator, namely: SHIFT/HOLD-UP, SERIAL-IN-UP, SHIFT/HOLD-DOWN, SERIAL-IN-DOWN, USE-UP/DOWN. If the up-direction shift register 410 were replaced by a binary up-only counter, then the same number of extra global signals would still be needed, however some names would be changed: RESET-COUNTER would replace SERIAL-IN-UP, and COUNT/HOLD-UP would replace SHIFT/HOLD-UP. If Butterfly and Galpat tests are not supported, then only 2 global signals are needed, namely the 2 belonging to the up-direction shift-register 410, or the up-only counter (assuming b>=n; or 3 global signals for the up-down counter, assuming b<n, since another signal is needed to specify the direction of counting).

The 2-to-1 multiplexors 440-442 select between the up-direction shift-register, or the exclusive-OR 430, 431, 432 of the two shift-registers 410, 420. If the down-direction shift-register 420 is being used during a Butterfly or Galpat test, then this exclusive-OR is exactly the functionality that we want. If the down-direction shift-register 420 is being used as the down-direction address source during a March test, then this will work correctly, if and only if, the up-direction shift register 410 is loaded with all 0's. But ending an LFSR-style address sequence on all 0's is easy to do, so this is not a problem, and it saves: the extra area of widening the multiplexors to 3-to-1, plus a sixth global signal since the wider mux would need 2 select signals instead of 1.
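
The selection behavior just described can be modeled bit by bit. In the sketch below, the function name and the `use_xor` flag are illustrative; the flag stands in for the USE-UP/DOWN select signal, and its polarity is an assumption.

```python
def generated_address(up_bits, down_bits, use_xor):
    """Model the per-bit 2-to-1 multiplexors and EXOR gates.

    use_xor=False passes the up-direction register through unchanged;
    use_xor=True passes the bit-wise EXOR of the two registers.
    """
    if use_xor:
        return [u ^ d for u, d in zip(up_bits, down_bits)]
    return list(up_bits)
```

With the up-direction register holding all 0's, the EXOR path reproduces the down-direction register, which is the March-test case. With the up-direction register holding a base address and the other register holding a walking 1, the EXOR path yields an address at a Hamming distance of 1 from the base, which is the Butterfly case.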

In FIG. 3, two address generators are shown: one for writing and the other for reading. For most custom memories, it will be possible to share one address generator for both write and read address ports. Two conditions must be satisfied to allow sharing to occur: (1) read and write operations must be pipelined by exactly the same number of clock cycles, and (2) the read and write ports must be close enough physically to justify the extra routing cost. Hence, if read and write operations are each pipelined by a different number of clock cycles, then there must be 2 independent address generators, each with its own set of 5 global signals (or 2 global signals, for March tests only, when b>=n; or 3 global signals, for March tests only, when b<n). To understand the implications of pipelined operations within a March test, consider the unpipelined and fully-general pipelined examples below:


March-8 (unpipelined)
⇑(w0, r0)
⇑(r0, w1, r1)
⇑(r1, w0, r0)
⇓(r0, w1, r1)
⇓(r1, w0, r0)

pipelined Writes    pipelined Reads    pipelined Compares
⇑(w0, -)            ⇑(-, r*)           ⇑(-, c0)
⇑(-, w1, -)         ⇑(r*, -, r*)       ⇑(c0, c1)
⇑(-, w0, -)         ⇑(r*, -, r*)       ⇑(c1, c0)
⇓(-, w1, -)         ⇓(r*, -, r*)       ⇓(c0, c1)
⇓(-, w0, -)         ⇓(r*, -, r*)       ⇓(c1, c0)



This example shows 3 March tests running concurrently, 
the Write column containing writes and no-ops, the Read 
column containing reads (without expected values) and 
no-ops, and the Compare column containing comparisons 
with expected data and no-ops. This example exposes the 
hidden complexity of read operations, when pipelining is 
introduced. Write operations are very neat, by comparison 
with reads, because both the destination address and the 
destination data must be ready when the write enable signal 
is asserted. A complete read operation refers to 2 separate
events in time: (1) the desired address must be ready when
the read enable signal is asserted, and (2) the desired data 
shows up some time later. Hence, there are actually 3 
latencies that we care about: (1) the time required to decode 
the write address and store the write data, (2) the time 
required to decode the read address, and (3) the time 
required to obtain the read data. 
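
The way a single March test separates into three concurrent streams can be sketched as follows. This is a minimal illustration in Python and not part of the described hardware: the list-of-tuples encoding, the names `MARCH_ELEMENTS` and `split_streams`, and the choice to keep a no-op placeholder ("-") in every slot are assumptions made for the example.

```python
# Each March element is (address direction, list of operations).
# "w0"/"w1" write a 0/1; "r0"/"r1" read and expect a 0/1.
MARCH_ELEMENTS = [
    ("up",   ["w0", "r0"]),
    ("up",   ["r0", "w1", "r1"]),
    ("up",   ["r1", "w0", "r0"]),
    ("down", ["r0", "w1", "r1"]),
    ("down", ["r1", "w0", "r0"]),
]


def split_streams(march):
    """Split an unpipelined March test into write, read and compare streams."""
    writes, reads, compares = [], [], []
    for direction, ops in march:
        # Write stream: keep writes, turn everything else into a no-op.
        writes.append((direction, [op if op.startswith("w") else "-" for op in ops]))
        # Read stream: reads carry no expected value, so only "r" remains.
        reads.append((direction, ["r" if op.startswith("r") else "-" for op in ops]))
        # Compare stream: the expected value of each read becomes a compare.
        compares.append(
            (direction, ["c" + op[1:] if op.startswith("r") else "-" for op in ops])
        )
    return writes, reads, compares


writes, reads, compares = split_streams(MARCH_ELEMENTS)
for w, r, c in zip(writes, reads, compares):
    print(w, r, c)
```

Each stream can then be delayed by its own latency, which is why the read address, the read data, and the write path are treated as three separate timing concerns.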

While the differential latency between the Write and Read 
address ports can be any arbitrary integer amount, memories 
having zero differential latency between the Write and Read 
address ports can share the same address generator. In this 
case, the pipelined march test simplifies to the form shown 
below. 



pipelined            pipelined
Writes/Reads         Compares

⇑ (w0, r)            ⇑ (-, c0)
⇑ (r, w1, r)         ⇑ (c0, c1)
⇑ (r, w0, r)         ⇑ (c1, c0)
⇓ (r, w1, r)         ⇓ (c0, c1)
⇓ (r, w0, r)         ⇓ (c1, c0)



This allows 2 independent address generators to be used
depending on the latency of the Compare operations, and on
whether address-based patterns in our march tests are used
(such as March-4, March-5, March-7). These address
generation techniques correspond to step 801 of the method of
FIG. 8.

The form of input data required is directly determined by
which variation of the March tests we choose to support.
March-5, March-7, and similar "address-based data pattern"
march tests are suitable for most custom memories. FIG. 5
illustrates a 4-bit version of a data generator.

The data generator of FIG. 5 uses 3 global signals, 
namely: USE-ADDR, EVEN, ODD. The USE-ADDR sig- 
nal is not strictly required for March-5 testing, but it 
provides a low-cost method of turning off the address inputs 
and allowing the use of (unshaded) March-2 as a supple- 
mentary test. The step 802 of the method of FIG. 8 corre- 
sponds to these data generation techniques. Concurrent with 
steps of the method of FIG. 8, the generated data value may 
be stored in a memory location, such as in a RAM, ROM, 
buffer, or any other suitable memory location as recognized 
by one skilled in the art, designated at step 803 in FIG. 8. 

The data generator of FIG. 5 can also be used to generate 
expected data values. The expected data value will generally 
be the same as the data stored, or its inverse. A polarity 
signal is connected to exclusive-or gates in the manner 
indicated to provide the correct polarity of compare data. 
The signals TEXPECT0-TEXPECT3 correspond to step 
804 of the method of FIG. 8. 
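
A behavioural sketch of such a data generator is given below in Python. It rests on an assumption that is consistent with the signal names but does not reproduce the exact gate arrangement of FIG. 5: each address bit is gated by USE-ADDR and then exclusive-ORed with EVEN (for even-numbered bits) or ODD (for odd-numbered bits), and the polarity signal inverts every bit of the expected value. The function names are hypothetical.

```python
def generate_data(addr: int, use_addr: bool, even: bool, odd: bool,
                  width: int = 4) -> int:
    """Return the data word written to memory for a given address."""
    data = 0
    for i in range(width):
        # USE-ADDR gates the address bit (assumed AND-style gating).
        addr_bit = (addr >> i) & 1 if use_addr else 0
        # EVEN qualifies even-numbered bits, ODD qualifies odd-numbered bits.
        qualifier = int(even) if i % 2 == 0 else int(odd)
        data |= (addr_bit ^ qualifier) << i
    return data


def expected_data(addr: int, use_addr: bool, even: bool, odd: bool,
                  polarity: bool, width: int = 4) -> int:
    """Return the expected (compare) word; polarity inverts every bit."""
    mask = (1 << width) - 1
    data = generate_data(addr, use_addr, even, odd, width)
    return data ^ (mask if polarity else 0)


# Address-based pattern: the stored data follows the address bits.
stored = generate_data(0b1010, use_addr=True, even=False, odd=False)

# Expected value with inverted polarity for the same address.
inverted = expected_data(0b1010, use_addr=True, even=False, odd=False,
                         polarity=True)
```

Under this model, holding USE-ADDR low leaves only EVEN and ODD in control, so the generator emits fixed patterns that do not depend on the address.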

Data comparison is used to determine if the values have 
been stored and read properly. Data comparators fall into 
two major classes: (1) fully deterministic data comparators, 
which require a source of expected data, usually from a 
nearby "data generator and address generator pair"; and (2) 
signature analyzers, which do not require any source of 
expected data, but are instead implemented using "multi- 
input linear feedback shift registers (LFSRs)", commonly 
abbreviated as MISRs. If the actual data output is 8 bits or 
less in width, it may make more sense to simply route these 
8 bits, or less, to some multiplexed output pins, rather than 
construct either type of comparator. 
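
The contrast between the two classes can be sketched in Python as follows. This is a simplified behavioural model under stated assumptions: the class names are hypothetical, the deterministic comparator is modelled as one sticky mismatch flag per bit, and the MISR uses the polynomial x^4 + x + 1, chosen here only because it is a well-known primitive polynomial of order 4 and not because the source specifies it.

```python
class DeterministicComparator:
    """Per-bit sticky mismatch flags; needs expected data every cycle."""

    def __init__(self, width: int = 4):
        self.mask = (1 << width) - 1
        self.flags = 0  # all 0's means "no mismatches"

    def compare(self, actual: int, expected: int) -> None:
        # XOR finds mismatching bits; OR makes each flag sticky.
        self.flags |= (actual ^ expected) & self.mask


class Misr:
    """Multi-input LFSR; compacts actual data without expected data."""

    def __init__(self, width: int = 4, taps: int = 0b0011):
        self.width = width
        self.mask = (1 << width) - 1
        self.taps = taps  # feedback taps for x^4 + x + 1 (assumed)
        self.state = 0

    def clock(self, actual: int) -> None:
        feedback = (self.state >> (self.width - 1)) & 1
        self.state = (self.state << 1) & self.mask
        if feedback:
            self.state ^= self.taps
        self.state ^= actual & self.mask


def run(words, expected):
    det, misr = DeterministicComparator(), Misr()
    for actual, exp in zip(words, expected):
        det.compare(actual, exp)
        misr.clock(actual)
    return det.flags, misr.state


expected = [0b0000, 0b1111, 0b0101, 0b1010]
good_flags, good_signature = run(expected, expected)

faulty = list(expected)
faulty[2] ^= 0b0100  # inject a single-bit read error
bad_flags, bad_signature = run(faulty, expected)
```

The deterministic comparator reports which bit position failed, while the MISR only reports that the final signature differs from the fault-free one, which must be known in advance.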

The flip-flops in a deterministic comparator all start with
all 0's contents, to signify "no mismatches". As soon as a
mismatch is detected at a particular bit position, that bit
position's flip-flop stores a 1, which does not change until all
the flip-flops are reset to all 0's again. Such a comparator
tells us only which bit positions experienced mismatches.
We find out about the mismatches when we serially shift-out
the contents of the flip-flops. If we wanted to know at exactly
which time-step the mismatch occurred, then we must either
shift-out the flip-flops at every time-step (which would be
painfully slow to do, and would invalidate the "at-speed"
nature of all march tests), or add an additional tree of
OR-gates to monitor the flip-flops' outputs (but that would
only tell us when the first mismatch occurred, since the
output would remain a steady 1 until the next reset
happened), or add an additional tree of EXOR-gates to
monitor the parity of the flip-flops' outputs (but that would
fail to show us when 2 simultaneous mismatches happened).

In addition to generating data in the manner indicated, the
data generator of FIG. 5 can be used for debugging purposes
by allowing the contents of the memory to be read. For
example, where each bit of memory is compared to an
expected value using an exclusive-or gate, and the compare
results are latched in order to be observable external to the
device, the data generator of FIG. 5 can allow the contents
of the memory to be viewed by asserting all expected
outputs to zero. By asserting all expected outputs to zero, the
exclusive-or comparison of the actual data values to zero
values asserted by the data generator will result in the stored
data values being generated and latched. The expected
values TEXPECT0-TEXPECT3 can be forced to zero by
driving USE-ADDR, EVEN, ODD, and POLARITY low.
By allowing the contents of internal memories to be
accessed during a debug mode, with no additional circuitry,
an advantage is realized over the prior art.

Another debug feature of the data generator of FIG. 5 is
its ability to provide specific patterns to a memory. This is
useful when it is desirable to initialize a memory to a specific
pattern. By driving the USE-ADDR signal low, it is possible
to assert and deassert ODD and EVEN in order to write all
ones, all zeros, or alternating one and zero patterns.

FIG. 6 shows how to construct a data comparator using
SEDFFTR flip-flops 610 and 620 (Scannable, Enableable
D-Flip-Flop, with synchronous Reset). The "expected data"
(ED) comes from a nearby data generator circuit. The
"actual data" (AD) comes from the custom memory itself.
The RCLK is the clock used for all "test collar" circuits, as
was previously explained. One "serial-out mismatch" signal
must be routed to a multiplexed output pad; in the case of
other embodiments of the present invention, multiple
scan-out pads will be re-used by multiplexing the various
"serial-out mismatch" signals through these multiple pads. A
few input pads, for example 3, will act as mux-select signals
that will determine which group of "serial-out mismatch"
signals are observable at any time. Besides the serial-out
signal, this comparator also needs 3 global input signals for
control, namely: SHIFT, RESET, HOLD-COMPARE.

Note that the first flip-flop's 610 SI pin has a Logic-1
driving it. This provides the rudimentary ability to check the
operation of the flip-flop's shifting. RESET loads all the
flip-flops with 0's, which are shifted out in b time-steps,
followed by 1's thereafter as illustrated with OR gate 611
that ORs the flip-flop 610 b output and the output of the OR
gate 611. The above circuit happens to use an SEDFFTR for
each output bit, but it could also be constructed using an
SDFFTR, where the HOLD-COMPARE function would
now be implemented by adding an AND-gate between the
EXOR and OR-gates. In order to reduce the size of the test
circuitry, the second flip-flop is associated with multiple bits
of data. Specifically, the OR gate 621 receives compare data
from the EXOR gates 622 and 623. Such an implementation
allows minimization of the test circuitry, while maintaining
a high level of fault coverage.

The major difference between the deterministic data
comparator and the MISR-based comparator is the complete
absence of expected data with the MISR. This is only a
potential advantage when the deterministic comparator
requires pipelined expected data, because then a second
"data generator and address generator pair" must generally
be implemented, for the comparator's exclusive use. The
MISR has its own disadvantages: (a) some small fraction of
potential mismatches will escape detection because the
MISR actually performs lossy data compaction; and (b) a
genuine "primitive polynomial" must be implemented by the
MISR, or else the data compaction becomes extremely lossy.
The final contents of the MISR, which will be shifted out for
inspection, is called "the signature". A distinct expected
"signature" must be computed or simulated for each (custom
memory, march test) pair that will be used, unlike the
deterministic comparator, where a string of all 0's is always
the expected "signature". For very large values of b, there is
some concern surrounding the selection of a genuine
"primitive polynomial". Published lists of primitive
polynomials should not be trusted because of potential
misprints. Verifying that a polynomial of order 50 and beyond
is indeed primitive is a very non-trivial matter. Therefore, to
obtain a primitive polynomial representation one should
select a verified polynomial of order 32, to be replicated 4
times, to obtain 128 bits (or even 4 verified polynomials of
different orders that add up to 128, such as 31+33+29+35)
instead of a single, unverified polynomial of order 128. FIG. 7
shows a 4-bit MISR, with a middle bit-slice implemented
using an SEDFFTR. FIG. 7a illustrates four actual data
entries each provided to a bit-wise exclusive-OR operation,
designated 720-723. Thereupon, the signal is presented to the
SEDFFTRs, designated 710-713.

Just like the deterministic comparator, the MISR-based
comparator needs 3 global signals for control, namely:
RESET, SHIFT, HOLD-COMPARE, and the usual RCLK.
As illustrated in FIG. 7b, the HOLD-COMPARE, SHIFT,
RCLK and RESET are provided to the SEDFFTR 711, as is
the output signal from EXOR 721. The final flip-flop's serial
output performs the same role as the deterministic
"serial-out mismatch", which is muxed-out on one of a
plurality of output pads. Whether or not a feedback TAP
appears at any given bit location is determined by the
coefficients of the chosen primitive polynomial.

Further extensions are possible: (1) the primitive
polynomial can be made wholly or partly programmable, by
adding extra flip-flops and AND-gates to selectively control
the optional feedback TAPs; this would involve adding 2 more
global signals as well, (a) to control the shifting and (b) to
supply the serial-in coefficient stream; and (2) the MISR-based
comparator may be combined with a deterministic
comparator, by placing a 2-to-1 mux in front of the D pin of
each SEDFFTR flop, and by using an extra global signal to
select either the MISR's EXOR-gate for each bit, or the
deterministic EXOR with OR-gate combination for each bit.
Such comparator techniques are represented by step 805 of
FIG. 8.

The table below shows how many globally broadcast
signals are required for each possible combination of (a)
fault coverage (F.C.) type: maximum (march, butterfly and
galpat, plus flexible data generator), or minimum (march
tests only, and minimal data generator); (b) comparator type:
deterministic, or MISR-based; (c) expected data type
(applies to deterministic comparator only): shared, or
pipelined. All the rows refer to input-direction pads needed,
except for the row entitled "shared-scan-outs", which are
output-direction.



                         Deterministic Comparator       MISR-based
                                                        Comparator
                      Shared         Pipelined          (No Expected
                      Expected Data  Expected Data      Data Used)

                      Max.   Min.    Max.   Min.        Max.   Min.
                      F.C.   F.C.    F.C.   F.C.        F.C.   F.C.

global control         5      5       5      5           5      5
first addr. gen.       5      2       5      2           5      2
first data gen.        3      2       3      2           3      2
2nd addr. gen.         0      0       5      2           0      0
second data gen.       0      0       3      2           0      0
data comp.             3      3       3      3           3      3
shared-scan-out        8      8       8      8           8      8
scan-mux select        3      3       3      3           3      3

Totals                27     23      35     27          27     23



In addition to the above global signal requirements, there
are a few more types of testability-related top-level signals,
such as: (1) all the test-mode clocks, for which a
device-dependent number will exist, (2) some clock-configuration
signals, whose number is also device-dependent, and (3) the
TESTEN signal, which is 1 pin.
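
The totals row of the table of globally broadcast signals can be cross-checked with a short Python sketch. The dictionary below simply transcribes the table's rows; the variable names and column labels are illustrative only.

```python
# Column order: shared max/min, pipelined max/min, MISR max/min.
COLUMNS = ["shared/max", "shared/min", "pipelined/max",
           "pipelined/min", "misr/max", "misr/min"]

ROWS = {
    "global control":   [5, 5, 5, 5, 5, 5],
    "first addr. gen.": [5, 2, 5, 2, 5, 2],
    "first data gen.":  [3, 2, 3, 2, 3, 2],
    "2nd addr. gen.":   [0, 0, 5, 2, 0, 0],
    "second data gen.": [0, 0, 3, 2, 0, 0],
    "data comp.":       [3, 3, 3, 3, 3, 3],
    "shared-scan-out":  [8, 8, 8, 8, 8, 8],
    "scan-mux select":  [3, 3, 3, 3, 3, 3],
}

# Sum each column across all rows.
totals = [sum(column) for column in zip(*ROWS.values())]

for name, total in zip(COLUMNS, totals):
    print(f"{name:>14}: {total}")
```

The sums reproduce the table's totals row, and they make visible that the pipelined-expected-data columns are larger only because of the second address and data generators.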

FIG. 8 illustrates a method in accordance with the present 
invention. Each of the individual steps of FIG. 8 has been 
discussed with reference to the specific implementations of 
FIGS. 2-7. 

It should now be apparent that the present invention 
provides increased testability and flexibility for testing 
embedded memory systems. Furthermore, it should be 
understood that the present invention has been described 
with reference to specific embodiments. As such, other 
embodiments of the specific implementation are anticipated 
herein. For example, as discussed with reference to FIG. 6,
four operations (Reset, Compare, Hold, Shift-Out) are
associated with the latches 610 and 620. Therefore, an
implementation receiving a two-bit control signal could
replace the illustrated three-control-signal implementation
(RESET, SHIFT, HOLD-COMPARE of FIG. 6).
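
One way to picture the two-bit alternative is sketched below in Python. The particular encoding of the four operations and the names `Op` and `next_flag` are assumptions for illustration only; the source does not specify an encoding.

```python
from enum import IntEnum


class Op(IntEnum):
    """Hypothetical two-bit encoding of the four latch operations."""
    RESET = 0b00
    COMPARE = 0b01
    HOLD = 0b10
    SHIFT_OUT = 0b11


def next_flag(ctrl: int, flag: int, mismatch: int, serial_in: int) -> int:
    """Next state of one comparator flip-flop for a two-bit control value."""
    op = Op(ctrl & 0b11)
    if op is Op.RESET:
        return 0                 # clear the mismatch flag
    if op is Op.COMPARE:
        return flag | mismatch   # sticky: once set, stays set
    if op is Op.HOLD:
        return flag              # ignore the compare result
    return serial_in             # SHIFT_OUT: take the scan-chain input


# A mismatch seen during COMPARE survives a later HOLD cycle.
flag = next_flag(Op.RESET, 1, 0, 0)
flag = next_flag(Op.COMPARE, flag, 1, 0)
flag = next_flag(Op.HOLD, flag, 0, 0)
```

Since the four operations are mutually exclusive, two encoded bits are enough to select among them, which is the basis for the suggested reduction from three control signals.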

I claim: 

1. A method for testing a memory, the method comprising:
   providing a data value from a first data generator, wherein
      the first data generator receives a first address value
      from a first address generator and generates the data
      value based upon the first address value;
   storing the data value at a memory location;
   providing a compare value from a second data generator,
      wherein during a first mode of operation, the second
      data generator receives a second address value from a
      second address generator and generates the compare
      value based upon the second address value, wherein the
      first data generator is the same as the second data
      generator; and
   comparing the compare value to the data value stored at
      the memory location.

2. The method of claim 1, further comprising the step of:
   providing the first address value by receiving a primitive
      polynomial representation, and operating the first
      address generator based upon a pseudo-random
      bit-stream associated with the primitive polynomial
      representation.

3. The method of claim 1, wherein the step of providing
a compare value further comprises the substep of during a
second mode of operation, the second data generator
generates a predetermined value allowing the contents of the
memory location to be determined.

4. A method for testing a memory, the method comprising:
   receiving an address value from an address generator;
   generating a data value based upon the address value;
   providing the data value to a memory location;
   receiving a mode indicator, wherein when the mode
      indicator is in a first mode, the data value is used to test
      the memory associated with the memory location for
      address decoder faults; and
   when the mode indicator is in a second mode, the data
      value is used to test the memory associated with the
      memory location for coupling faults within memory
      locations.

5. The method of claim 4, wherein the step of generating
further includes:
   generating the data value based upon an exclusive-or
      result.

6. The method of claim 5, wherein the step of generating
further includes:
   selectively enabling the exclusive-or result using a first
      qualifier to generate the data value.

7. The method of claim 6, wherein the step of generating
further includes:
   selectively enabling the exclusive-or result using a second
      qualifier to generate the data value.

8. The method of claim 4, further comprising:
   generating a compare value based upon the address value;
      and
   changing a polarity of compare value to provide an
      expected data value.

9. An integrated test system for verifying memory, the
system comprising:
   a first gate having a first input to receive a memory value
      associated with a first memory location, a second input
      to receive a first expected value, and an output;
   a second gate having a first input to receive a memory
      value associated with a second memory location, a
      second input to receive a second expected value, and an
      output;
   a first latch having a data input port and a data output port;
   a first logic function having a first input coupled to the
      output of the first gate, a second input coupled to the
      output of the second gate, a third input to receive a
      data-bit from the data output port of the first latch, and
      an output to provide a data-bit to the data input port of
      the first latch;
   a third gate having a first input to receive a memory value
      associated with a third memory location, a second input
      to receive a third expected value, and an output;
   a second latch having a data input port and a data output
      port; and
   a second logic function having a first input coupled to the
      output of the third gate, a second input coupled to the
      output port of the second latch, and an output coupled
      to the data input port of the second latch.

10. The system of claim 9, wherein the first latch is a
flip-flop and the first logic function is an OR-function.

11. The system of claim 9, wherein the first and second
memory values are representative of data accessed by the
first and second memory locations.

12. An integrated test system for verifying memory, the
system comprising:
   a data generator having:
      a plurality of inputs to receive an address value,
      a plurality of outputs to provide a generated data value,
         and
      at least one control input to control the generated data
         value;
   a memory to receive the generated data value, wherein
      the at least one control input includes an enable input
      to enable generation based on an incoming address
      value;
      a first logic gate having a first input to receive a first bit
         of the address value, a second input to receive an
         enable signal, and an output;
      a second logic gate having a first input to receive a
         second bit of the address value, a second input to
         receive the enable signal, and an output;
      a third logic gate having a first input to receive a third
         bit of the address value, a second input to receive the
         enable signal, and an output;
      a first exclusive-or gate having a first input to receive
         the first bit, and a second input to receive a first input
         of the at least one control input;
      a second exclusive-or gate having a first input to
         receive the second bit, and a second input to receive
         a second input of the at least one control input; and
      a third exclusive-or gate having a first input to receive
         the third bit, and a second input to receive the first
         input of the at least one control input.

13. The system of claim 12, wherein the at least one
control input includes an exclusive-or enable input to
modify the generated data value.

14. The system of claim 12, wherein the at least one
control input includes multiple exclusive-or enable inputs to
modify the generated data value.

15. A method of testing an integrated memory, the method
comprising the steps of:
   when in a first mode of operation:
      receiving a primitive polynomial representation; and
      operating an address generator based upon the
         primitive polynomial representation;
   when in a second mode of operation:
      performing at least one of a March-2 test and March-3
         test.

16. The method of claim 15, wherein the primitive
polynomial representation is formatted as a serial bit stream.

17. The method of claim 15, wherein the step of operating
includes shifting the primitive polynomial representation
into the address generator.

18. The method of claim 17, wherein the representation is
software generated.

19. The method of claim 15, further comprising:
   when in a third mode of operation performing at least one
      of a March-4 test and March-5 test.

20. The method of claim 19, further comprising:
   when in a fourth mode of operation performing a Butterfly
      test.

21. The method of claim 20, further comprising:
   when in a fifth mode of operation performing a Galpat
      test.

22. The method of claim 15, wherein shift-registers are
the only address sources.

* * * * *